Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
Hybrid RL is the setting where an RL agent has access to both offline data
and online data by interacting with the real-world environment. In this work,
we propose a new hybrid RL algorithm that combines an on-policy actor-critic
method with offline data. On-policy methods such as policy gradient and natural
policy gradient (NPG) have been shown to be more robust to model
misspecification, though they are sometimes not as sample efficient as methods that rely on
off-policy learning. On the other hand, offline methods that depend on
off-policy training often require strong assumptions in theory and are less
stable to train in practice. Our new approach integrates a procedure of
off-policy training on the offline data into an on-policy NPG framework. We
show that our approach, in theory, can obtain a best-of-both-worlds type of
result -- it achieves state-of-the-art theoretical guarantees of offline RL
when offline RL-specific assumptions hold, while at the same time maintaining
the theoretical guarantees of on-policy NPG regardless of the offline RL
assumptions' validity. Experimentally, in challenging rich-observation
environments, we show that our approach outperforms a state-of-the-art hybrid
RL baseline which only relies on off-policy policy optimization, demonstrating
the empirical benefit of combining on-policy and off-policy learning. Our code
is publicly available at https://github.com/YifeiZhou02/HNPG.
Comment: The first two authors contributed equally.
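As a toy illustration of the idea in this abstract, the sketch below trains a softmax policy with on-policy policy-gradient steps while its critic is additionally fit off-policy on an offline dataset. The two-armed bandit, step sizes, and running-mean critic are illustrative assumptions for the sketch, not the paper's HNPG algorithm.

```python
import math, random

random.seed(0)

# Toy 2-armed bandit: arm 1 pays more on average.
def pull(arm):
    return random.gauss(1.0 if arm == 1 else 0.2, 0.1)

# Offline dataset: (arm, reward) pairs from some behavior policy.
offline = [(a, pull(a)) for a in [0, 1] * 50]

def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    return [x / sum(e) for x in e]

theta = [0.0, 0.0]

# Critic: running mean reward per arm, warm-started off-policy
# from the offline data before any online interaction.
q, n = [0.0, 0.0], [0, 0]
for a, r in offline:
    n[a] += 1
    q[a] += (r - q[a]) / n[a]

lr = 0.1
for _ in range(500):
    pi = softmax(theta)
    a = 0 if random.random() < pi[0] else 1  # on-policy action
    r = pull(a)
    # Keep refining the critic with fresh data as well.
    n[a] += 1
    q[a] += (r - q[a]) / n[a]
    # On-policy policy-gradient step using the critic's advantage.
    adv = q[a] - sum(p * v for p, v in zip(pi, q))
    for i in range(2):
        theta[i] += lr * ((1 if i == a else 0) - pi[i]) * adv
```

After training, the policy concentrates on the better arm; the warm-started critic is what lets the on-policy updates make progress from the first step.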
Vision Transformers for Single Image Dehazing
Image dehazing is a representative low-level vision task that estimates
latent haze-free images from hazy images. In recent years, convolutional neural
network-based methods have dominated image dehazing. However, vision
Transformers, which have recently made breakthroughs in high-level vision
tasks, have not brought new dimensions to image dehazing. We start with the
popular Swin Transformer and find that several of its key designs are
unsuitable for image dehazing. To this end, we propose DehazeFormer, which
incorporates several improvements, such as a modified normalization layer,
activation function, and spatial information aggregation scheme. We train
multiple variants of DehazeFormer on various datasets to demonstrate its
effectiveness. Specifically, on the most frequently used SOTS indoor set, our
small model outperforms FFA-Net with only 25% of its parameters and 5% of its computational cost.
To the best of our knowledge, our large model is the first method with a PSNR
over 40 dB on the SOTS indoor set, dramatically outperforming the previous
state-of-the-art methods. We also collect a large-scale realistic remote
sensing dehazing dataset for evaluating the method's capability to remove
highly non-homogeneous haze.
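For reference, PSNR, the metric behind the 40 dB claim above, is a simple function of the mean squared error between the restored and ground-truth images. A minimal sketch, assuming unit-range pixel values flattened into lists:

```python
import math

def psnr(clean, restored, max_val=1.0):
    """Peak signal-to-noise ratio between two equal-length pixel lists."""
    mse = sum((c - r) ** 2 for c, r in zip(clean, restored)) / len(clean)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# For unit-range images, 40 dB corresponds to an MSE of about 1e-4:
value = psnr([0.5] * 100, [0.51] * 100)  # MSE ~ 1e-4, so ~ 40 dB
```

Every extra 10 dB corresponds to a 10x reduction in MSE, which is why crossing 40 dB on SOTS indoor is a notable margin over prior methods.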
Rethinking Performance Gains in Image Dehazing Networks
Image dehazing is an active topic in low-level vision, and many image
dehazing networks have been proposed with the rapid development of deep
learning. Although these networks' pipelines work well, the key mechanisms
behind improved image dehazing performance remain unclear. For this reason, we
do not aim to propose a dehazing network with fancy modules; rather, we make
minimal modifications to the popular U-Net to obtain a compact dehazing network.
Specifically, we swap out the convolutional blocks in U-Net for residual blocks
with the gating mechanism, fuse the feature maps of main paths and skip
connections using the selective kernel, and call the resulting U-Net variant
gUNet. As a result, with a significantly reduced overhead, gUNet is superior to
state-of-the-art methods on multiple image dehazing datasets. Finally, we
verify the contribution of these key designs to the performance gains of image
dehazing networks through extensive ablation studies.
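The gating idea mentioned above can be sketched minimally as follows. The scalar lambdas below stand in for the block's convolutions and are purely hypothetical; gUNet's actual blocks operate on multi-channel feature maps.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_residual(x, value_branch, gate_branch):
    # y = x + value(x) * sigmoid(gate(x)): the sigmoid gate modulates,
    # elementwise, how much of the residual branch is added back.
    return [xi + value_branch(xi) * sigmoid(gate_branch(xi)) for xi in x]

# Hypothetical scalar branches standing in for convolutions.
features = gated_residual([0.0, 1.0, -1.0], lambda v: 2 * v, lambda v: v)
```

The gate lets the network suppress or pass each residual feature adaptively, which is the mechanism the abstract credits alongside selective-kernel fusion of main-path and skip-connection features.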
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
We consider a hybrid reinforcement learning setting (Hybrid RL), in which an
agent has access to an offline dataset and the ability to collect experience
via real-world online interaction. The framework mitigates the challenges that
arise in both pure offline and online RL settings, allowing for the design of
simple and highly effective algorithms, in both theory and practice. We
demonstrate these advantages by adapting the classical Q learning/iteration
algorithm to the hybrid setting, which we call Hybrid Q-Learning or Hy-Q. In
our theoretical results, we prove that the algorithm is both computationally
and statistically efficient whenever the offline dataset supports a
high-quality policy and the environment has bounded bilinear rank. Notably, we
require no assumptions on the coverage provided by the initial distribution, in
contrast with guarantees for policy gradient/iteration methods. In our
experimental results, we show that Hy-Q with neural network function
approximation outperforms state-of-the-art online, offline, and hybrid RL
baselines on challenging benchmarks, including Montezuma's Revenge.
Comment: 42 pages, 6 figures. Published at ICLR 2023. Code available at
https://github.com/yudasong/Hy
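The hybrid recipe, Q-learning updates drawn from both replayed offline transitions and fresh online experience, can be sketched in a tabular toy. The chain MDP, 1:1 mixing ratio, and hyperparameters below are illustrative assumptions for the sketch, not Hy-Q's exact procedure.

```python
import random

random.seed(0)

# Toy chain MDP: states 0..4; reaching state 4 ends the episode with reward 1.
N, GAMMA, ALPHA = 5, 0.9, 0.1
ACTIONS = [0, 1]  # 0: left, 1: right

def step(s, a):
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    done = s2 == N - 1
    return s2, (1.0 if done else 0.0), done

# Offline dataset: transitions from a uniformly random behavior policy.
offline, s = [], 0
for _ in range(3000):
    a = random.choice(ACTIONS)
    s2, r, done = step(s, a)
    offline.append((s, a, r, s2, done))
    s = 0 if done else s2

Q = [[0.0, 0.0] for _ in range(N)]

def update(s, a, r, s2, done):
    target = r if done else r + GAMMA * max(Q[s2])
    Q[s][a] += ALPHA * (target - Q[s][a])

s = 0
for _ in range(3000):
    # One online step with epsilon-greedy exploration...
    if random.random() < 0.1:
        a = random.choice(ACTIONS)
    else:
        a = 0 if Q[s][0] > Q[s][1] else 1
    s2, r, done = step(s, a)
    update(s, a, r, s2, done)
    s = 0 if done else s2
    # ...paired with one update replayed from the offline dataset.
    update(*random.choice(offline))
```

The offline replay supplies coverage the online policy has not yet earned, while online interaction corrects for states the offline data covers poorly, which is the complementarity the abstract describes.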
Cysteine Oxidation Reactions Catalyzed by a Mononuclear Non-heme Iron Enzyme (OvoA) in Ovothiol Biosynthesis
OvoA in ovothiol biosynthesis is a mononuclear non-heme iron enzyme catalyzing
the oxidative coupling between histidine and cysteine. It can also catalyze
the oxidative coupling between hercynine and cysteine, yet with a different
regioselectivity. Due to the potential application of this reaction for
industrial ergothioneine production, in this study we systematically
characterized OvoA by a combination of three different assays. Our studies
revealed that OvoA can also catalyze the oxidation of cysteine to either
cysteine sulfinic acid or cystine. Remarkably, these OvoA-catalyzed reactions
can be systematically modulated by a slight modification of one of its
substrates, histidine - …